1) Introduction
A customer with a worldwide presence (50,000+ users), running an obsolete and geographically distributed mix of Sametime V3.1 and V6.5 servers, decided to refresh its environment and used this opportunity to consolidate servers:
- Deployment of Sametime® V8.0.1 for Chat and Sametime® Enterprise Meeting Server (EMS) V8.0.1 for Web meetings.
- Server consolidation in a single European datacenter.
This document describes the new consolidated Sametime® V8.0.1 infrastructure and the configuration tuning we had to perform on this environment.
2) The initial Sametime V6.5 and V3.1 infrastructure
Figure 1
The following table lists the servers used in this initial infrastructure.
Geography | Server Role | OS | Installed Software |
Asia Pacific (APAC) | Sametime AP | Win2000 | Domino 6.0 + Sametime 3.1 |
EMEA | Sametime EMEA IM | Win2000 | Domino 6.5 + Sametime 6.5 |
EMEA | Sametime EMEA Web conf | Win2000 | Domino 6.5 + Sametime 6.5 |
North America (NA) | Sametime NA | Win2000 | Domino 6.5 + Sametime 6.5 |
South America (SA) | Sametime SA | Win2000 | Domino 6.5 + Sametime 6.5 |
The Web meetings were replicated between the APAC, NA, SA and EMEA servers, in order to optimize the WAN bandwidth utilization.
The customer had two main issues with its legacy environment:
- Reliability issues linked to hardware and software obsolescence.
- Difficulty operating servers consistently when they were spread across four different geographical zones.
In order to resolve these issues, the customer decided to refresh its Sametime infrastructure.
3) The new Sametime® V8.0.1 infrastructure
The main architectural decisions for the new infrastructure were:
- Upgrade to Sametime version 8.0.1.
- Deployment of high-availability configurations for both the Instant Messaging and Web conferencing environments.
- Consolidation of all servers into the EMEA datacenter.
- For Instant Messaging, use of the Sametime Connect client for browsers (JavaConnect applet). The deployment of the Sametime Connect rich client was planned for a later phase.
- Instant Messaging cluster with 2 dedicated MUX servers and 2 Community Servers.
- Web meeting cluster with 2 Sametime Enterprise Meeting Servers and 3 Meeting Room Servers.
Figure 2
The following table lists the specs of the servers used in the new infrastructure.
Machine | OS | Software | Specs | Comments |
EMS Node1 | Win2003 SP2 | EMS 8.0.1 / WAS 6.1.0.23 / IHS 6.1 | 2 dual cores @2.7 GHz - 4 GB RAM | IBM blade HS21 |
EMS Node2 | Win2003 SP2 | EMS 8.0.1 / WAS 6.1.0.23 / IHS 6.1 | 2 dual cores @2.7 GHz - 4 GB RAM | IBM blade HS21 |
EMS Database DB1 | AIX 5.3 | DB2 9.1.0.5 | | HADR (primary) / HACMP in the future |
EMS Database DB2 | AIX 5.3 | DB2 9.1.0.5 | | HADR (standby) / HACMP in the future |
EMS ST Room Server RS1 | Win2003 SP2 | Domino 8.0.2 + Sametime 8.0.1 | 2 dual cores @2.7 GHz - 4 GB RAM | IBM blade HS21 |
EMS ST Room Server RS2 | Win2003 SP2 | Domino 8.0.2 + Sametime 8.0.1 | 2 dual cores @2.7 GHz - 4 GB RAM | IBM blade HS21 |
EMS ST Room Server RS3 | Win2003 SP2 | Domino 8.0.2 + Sametime 8.0.1 | 2 dual cores @2.7 GHz - 4 GB RAM | IBM blade HS21 |
Load Balancer 1 | | Nortel Alteon 2208 | | Load balancer (ports 80/443/1533; 389 and 636 in the future) |
Load Balancer 2 | | Nortel Alteon 2208 | | Load balancer (ports 80/443/1533; 389 and 636 in the future) |
LDAP Node1 | OES 2 SP1 | Novell eDirectory 8.8.4 | 4 dual cores @3.16 GHz - 8 GB RAM | Customer LDAP directory |
LDAP Node2 | OES 2 SP1 | Novell eDirectory 8.8.4 | 4 dual cores @3.16 GHz - 8 GB RAM | Customer LDAP directory |
ST Community Server CS1 | Win2003 SP2 | Domino 8.0.2 + Sametime 8.0.1 | 2 vCPUs @1750 MHz* - 4 GB RAM | * VM image running on IBM x3950 |
ST Community Server CS2 | Win2003 SP2 | Domino 8.0.2 + Sametime 8.0.1 | 2 vCPUs @1750 MHz* - 4 GB RAM | * VM image running on IBM x3950 |
ST Mux1 | Win2003 SP2 | Sametime 8.0.1 Mux | 1 vCPU @1750 MHz* - 1 GB RAM | * VM image running on IBM x3950 |
ST Mux2 | Win2003 SP2 | Sametime 8.0.1 Mux | 1 vCPU @1750 MHz* - 1 GB RAM | * VM image running on IBM x3950 |
(*) VMware ESX 3.5 used for virtualization.
4) Configuration tuning
4.1) Infrastructure description
The deployed Sametime infrastructure is represented in Figure 3:
- The configuration of all Domino/Sametime servers (RS1, RS2, RS3, CS1 and CS2) is managed by the EMS servers (i.e. when those servers start, they retrieve their Sametime configuration from the EMS database rather than from the stconfig.nsf Domino database).
- Using the EMS administration console, Community Services and Meeting Services have been segregated:
- To prevent Meeting Services activities from occurring on CS1 and CS2 servers.
- To enforce Community Services activities on the CS1 and CS2 servers (defined as the users' home cluster in the LDAP directory).
- The DB2 databases of the DBx and DBy servers are synchronized using the DB2 HADR feature. The DBx database has been configured as the primary database, and the EMS1 and EMS2 servers connect to the DBx server when they start. In case of failure of the DBx server, a script is executed in order to declare the DBy database as the new primary database (a minimal sketch of this step is shown after this list).
- Remote applet download has been configured (the applets are downloaded from the EMS IHS servers instead of the Domino servers).
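The commands below are only a minimal sketch of that failover step; they assume the EMS database alias is sametime (the alias used in the DB2 commands of section 4.3) and that they are run as the DB2 instance owner on the DBy server. The actual failover script is site-specific.

# Check the HADR role and state of the local (standby) database
db2pd -db sametime -hadr

# Promote the standby to primary; BY FORCE is required when the old primary (DBx) is unreachable
db2 takeover hadr on database sametime by force

# Later, once DBx is available again, it can usually be reintegrated as the new standby (run on DBx)
db2 start hadr on database sametime as standby

Depending on how the EMS data source is defined, the EMS servers may also need to be repointed or restarted so that they connect to the new primary.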
This infrastructure has been installed following the procedures documented in the Sametime infocenter:
http://publib.boulder.ibm.com/infocenter/sametime/v8r0/index.jsp?topic=/com.ibm.help.sametime.801.doc/Home/super_welcome.html
Figure 3

4.2) Remote applet download
By default, in a Sametime Enterprise Meeting Server deployment, the MRC (Meeting Room Center) applet download is routed to a Domino room server. Once the applets are downloaded from the room server, they are stored locally in the workstation JVM cache to optimize the network bandwidth for the next access to the Meeting Room Center. The workstation JVM cache references the URL used to download the applets, i.e. the cached applets are valid only for this URL.
When multiple room servers are managed by EMS, EMS does not systematically route requests to the same room server. So even if the applets have already been stored in the workstation JVM cache, the user may download the applets again from another server and store another copy in the workstation JVM cache, thus defeating the bandwidth-saving purpose of caching the applets in the first place!
The "remoting applet downloads" configuration described in the Lotus sametime infocenter allows one to avoid this issue.
- Applets are downloaded from the EMS IHS servers instead of the room servers.
- With load balancers, the same URL is used for all IHS servers.
But the "remoting applet downloads" configuration applies only to the MRC applets and not to the javaconnect applet of the "Sametime Connect client for browsers".
To remote the JavaConnect applet downloads as well, we performed the following customization:
a) Copy the JavaConnect files from the Domino server to the IBM HTTP Server
Copy the javaconnect directory:
- from the ../Notes/data/domino/html/sametime/ Domino server directory
- to the ../IBMHttpServer/htdocs/en_US/sametime/ IBM HTTP Server directory
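As an illustration, the copy can be scripted on Windows with xcopy. The drive letters, parent directories and administrative share below are placeholders and must be adjusted to the actual Domino and IBM HTTP Server installations; the copy has to be performed for each EMS/IHS node.

rem Hypothetical paths: copy the javaconnect directory from a room server's Domino data
rem directory to the IHS document root of an EMS node (reached here through an admin share).
xcopy "D:\Lotus\Notes\data\domino\html\sametime\javaconnect" ^
      "\\EMS1\D$\IBM\IBMHttpServer\htdocs\en_US\sametime\javaconnect" /E /I /Y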
b) Modification of the webconnect.jsp file on the EMS servers
The webconnect.jsp file is located in the following EMS server directory:
../AppServer/profiles/<profile name>/installedApps/stwebconfCell01/STCenter.ear/stcenter.war/content/subforms/
In the following lines of the webconnect.jsp file, we changed 'SametimeServerBaseHTTPURL' to 'AppletDownloadURL':
- Line 39 <applet codebase="<stconfig:getStringValue shortName='SametimeServerBaseHTTPURL'/>
- Line 52 CODEBASE="<stconfig:getStringValue shortName='SametimeServerBaseHTTPURL'/>
- Line 57 <applet codebase="<stconfig:getStringValue shortName='SametimeServerBaseHTTPURL'/>
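After the edit, line 39 therefore begins as follows (the rest of the applet tag is unchanged and is truncated here, exactly as in the excerpts above); lines 52 and 57 are modified the same way:

<applet codebase="<stconfig:getStringValue shortName='AppletDownloadURL'/>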
Lotus Sametime infocenter:
http://publib.boulder.ibm.com/infocenter/sametime/v8r0/topic/com.ibm.help.sametime.801.doc/EMS/st_adm_ems_remoting_applets_t.html
4.3) DB2 tuning
Prior to moving the infrastructure into production, we performed stress tests using the IBM Lotus Server.Load utility.
When running these stress tests, we noticed that the EMS system blocked at a low number of concurrent users (~100, and often fewer). As the stress test ramped up, response times grew longer and longer, to over two minutes, even though we increased the time lag between virtual user connections.
Analysis of the EMS logs showed a number of errors related to access to the DB2 database.
The problem was fixed by applying DB2 tuning parameters with the following DB2 commands:
a) Modify Lock Configuration
db2 update db cfg for sametime using locklist 10000 maxlocks 60 locktimeout 60
b) Increase Bufferpool Size
db2 alter bufferpool sametime immediate size 8000
c) Increase Sortheap size
db2 update db cfg for sametime using sortheap 2560
d) Increase Monitor Heap Size
db2 update dbm cfg using mon_heap_sz 660
e) Lock Related Registry Modification (To be turned on after upgrading to DB2 8.2 FP15 or later)
db2set DB2_EVALUNCOMMITTED=ON
db2set DB2_SKIPDELETED=ON
db2set DB2_SKIPINSERTED=ON
db2 terminate
f) AUTO_RUNSTATS
db2 update db cfg for sametime using AUTO_RUNSTATS on
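To check that the new values are active, the database, database manager and registry configurations can be dumped with the standard DB2 commands below (sametime is the same database alias used above), and the relevant parameters searched in the output:

# Display the database configuration (locklist, maxlocks, locktimeout, sortheap, auto_runstats)
db2 get db cfg for sametime

# Display the database manager configuration (mon_heap_sz)
db2 get dbm cfg

# Display the DB2 registry variables (DB2_EVALUNCOMMITTED, DB2_SKIPDELETED, DB2_SKIPINSERTED)
db2set -all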
4.4) Tuning LDAP connections and queries
While the LDAP servers were responding to queries efficiently, the CS1 and CS2 servers were continuously logging the following messages:
- LDAP dd/mm/yy, hh:mm:ss Warning : Low efficiency [n] iterations
- LDAP dd/mm/yy, hh:mm:ss Warning : pending operations not responded within timeout
During normal operations this was transparent to end users. But if one Community Server (CS1 or CS2) failed, crashed, or was simply shut down, the other server was unable to accept the users trying to reconnect (excessive queuing of LDAP requests on this Sametime server).
After weeks of investigation, we discovered two major issues:
- The VMware layer was degrading the response time for LDAP queries.
- There was a bottleneck at the level of the STResolve process on the CS1 and CS2 servers.
We made two changes to fix these issues:
- Optimization of the VMware layer
The CS1 and CS2 virtual machines were sharing the same NICs (two Gigabit NICs) with dozens of other virtual machines.
We decided to add dedicated Gigabit NICs for each Community and MUX server, as shown in Figure 4.
Once this change was made, the impact of the VMware layer on the response time to LDAP queries fell to only 20% compared with physical machines. The CS1 and CS2 servers logged fewer error messages, but some were still appearing and the situation was not yet completely satisfactory.
Figure 4

- Tuning of the STResolve process
We applied a Hotfix (STResolve Hotfix RBLE-7RBPQ2) to the Sametime servers allowing multiple connections to the LDAP server from a single instance of STResolve (and other LDAP-related Sametime processes).
Over time we tuned the directory settings of the Sametime servers by modifying the [Directory] section of Sametime.ini, and ended up with the following set of parameters on the CS1 and CS2 servers to optimize the performance of the various LDAP-related Sametime processes (including STResolve):
[Directory]
ST_DB_LDAP_CONNECTIONS_NUMBER=5
ST_DB_LDAP_PENDING_LOW=30
ST_DB_LDAP_PENDING_MAX=60
ST_DB_LDAP_MAX_RESULTS=300
ST_DB_LDAP_MIN_WILDCARD=3
ST_DB_LDAP_KEEPALIVE_INTERVAL=5
ST_DB_LDAP_SSL_ONLY_FOR_PASSWORDS=1
The meaning of these settings is well described at http://www-01.ibm.com/support/docview.wss?rs=899&uid=swg21200143.
Once this change was applied, CS1 and CS2 servers stopped logging error messages related to poor LDAP efficiency.
Authors
V. Cailly - IBM Software IT architect (IBM /SWG)
X. Barros - IBM Project Manager (IBM /SO)
L. Guillaume - IBM Wintel & VMware IT architect (IBM /SO)